Counting occurrences of some subword patterns

Authors

  • Alexander Burstein
  • Toufik Mansour
Abstract

Counting the number of words which contain a set of given strings as substrings a certain number of times is a classical problem in combinatorics. This problem can, for example, be attacked using the transfer matrix method (see [20, Section 4.7]). In particular, it is a well-known fact that the generating function of such words is always rational. For example, in [20, Example 4.7.5] it is shown that the generating function for the number of words in $[3]^n$ where neither 11 nor 23 appear as two consecutive digits is given by $\frac{3 + x - x^2}{1 - 2x - x^2 + x^3}$. In this paper, we present, in several cases, a complete solution for the problem of the enumeration of words containing a subword pattern (see below for the precise definition) of length l exactly r times. For example, we find the number of words in $[3]^n$ containing the subword pattern 111 exactly r times, that is, the number of words which contain 111, 222, and 333 as substrings a total of r times. Régnier and Szpankowski [18] used a combinatorial approach to study the frequency of occurrences of strings (which they also called a “pattern”) from a given set in a random word, when overlapping copies of the “patterns” are counted separately (see [18, Theorem 2.1]). We note that the term “pattern” in [18] is used to denote an exact string rather than its type with respect to order isomorphism. For example, the “pattern” 112 in [18] is the actual string 112, whereas in our setting an occurrence of the subword pattern 112 is any substring aab of the ambient string with a < b. Although, in principle, it is possible to deduce our results from the result by Régnier and Szpankowski, our direct derivations are much simpler.
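The two examples in the abstract can be checked by brute force for small n. The sketch below is only an illustration and is not the authors' derivation: all function names are hypothetical, and it assumes that the coefficient of x^m in the rational function quoted from [20, Example 4.7.5] counts words of length m + 1 (an indexing assumption that the brute-force counts for small lengths appear to confirm). A subword pattern occurrence is detected by reducing each window of consecutive letters to its order-isomorphism type.

```python
from collections import Counter
from itertools import product


def series_coefficients(num, den, terms):
    """Power-series coefficients of num(x)/den(x); assumes den[0] == 1."""
    coeffs = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        for j in range(1, min(n, len(den) - 1) + 1):
            c -= den[j] * coeffs[n - j]
        coeffs.append(c)
    return coeffs


def count_avoiders(n, k=3, forbidden=("11", "23")):
    """Brute-force count of words in [k]^n with no forbidden adjacent pair."""
    total = 0
    for w in product(range(1, k + 1), repeat=n):
        s = "".join(map(str, w))
        if not any(f in s for f in forbidden):
            total += 1
    return total


def reduce_window(window):
    """Order-isomorphism type: replace each letter by its rank among distinct letters."""
    ranks = {v: i + 1 for i, v in enumerate(sorted(set(window)))}
    return tuple(ranks[v] for v in window)


def pattern_occurrences(word, pattern):
    """Number of windows of consecutive letters order-isomorphic to pattern."""
    m = len(pattern)
    target = tuple(pattern)
    return sum(
        reduce_window(word[i:i + m]) == target
        for i in range(len(word) - m + 1)
    )


def words_by_occurrences(n, k, pattern):
    """Map r -> number of words in [k]^n with exactly r occurrences of pattern."""
    counts = Counter()
    for w in product(range(1, k + 1), repeat=n):
        counts[pattern_occurrences(w, pattern)] += 1
    return dict(counts)


if __name__ == "__main__":
    # Numerator 3 + x - x^2 and denominator 1 - 2x - x^2 + x^3, lowest degree first.
    gf = series_coefficients([3, 1, -1], [1, -2, -1, 1], 6)
    brute = [count_avoiders(n) for n in range(1, 7)]
    print(gf == brute)
    # Words in [3]^4 grouped by the number of occurrences of the subword pattern 111.
    print(words_by_occurrences(4, 3, (1, 1, 1)))
```

For instance, the word 11223 contains the subword pattern 112 twice (as 112 and as 223), while the exact string 112 in the sense of [18] occurs only once. Exhaustive enumeration is exponential in n, so this is useful only as a sanity check on small cases, not as a replacement for generating functions.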

Similar resources

Counting Occurrences of Some Subword Patterns Alexander Burstein and Toufik Mansour

We find generating functions for the number of strings (words) containing a specified number of occurrences of certain types of order-isomorphic classes of substrings called subword patterns. In particular, we find generating functions for the number of strings containing a specified number of occurrences of a given 3-letter subword pattern.

Counting subwords in flattened partitions of sets

In this paper, we consider the problem of avoidance of subword patterns in flattened partitions, which extends recent work of Callan. We determine in all cases explicit formulas and/or generating functions for the number of set partitions of size n which avoid a single subword pattern of length three. The asymptotic behavior of the resulting counting sequences turns out to depend quite heavily ...

Counting Subwords in a Partition of a Set

A partition π of the set [n] = {1, 2, . . . , n} is a collection {B1, . . . , Bk} of nonempty disjoint subsets of [n] (called blocks) whose union equals [n]. In this paper, we find explicit formulas for the generating functions for the number of partitions of [n] containing exactly k blocks where k is fixed according to the number of occurrences of a subword pattern τ for several classes of pat...

Modular and Threshold Subword Counting and Matrix Representations of Finite Monoids

The subword relation reveals interesting combinatorial properties and plays a prominent role in formal language theory. For instance, recall that languages consisting of all words over Σ having a given word u ∈ Σ∗ as a subword serve as a generating system for the Boolean algebra of so-called piecewise testable languages. It was a deep study of combinatorics of the subword relation that led Simo...

On the structure of compacted subword graphs of Thue-Morse words and their applications

We investigate how syntactic properties of Thue-Morse words are related to special type of automata/graphs. The directed acyclic subword graph (dawg, in short) is a useful deterministic automaton accepting all suffixes of the word. Its compacted version (resulted by compressing chains of states) is denoted by cdawg. The cdawgs of Thue-Morse words have regular and very simple structure, in particu...

Journal:
  • Discrete Mathematics & Theoretical Computer Science

Volume 6

Publication date: 2003